Software Counters for First-Failure Data Capture



Disclosed is a method for using counters embedded in [...] debugging. By using software counters at various points in the code [...] conditions, the past and present status of the system [...]. The software counters [...] abnormal path in [...] system activity. This information is never overlaid or lost.

Software counters cause no [...] functioning. When an error occurs, [...] determine the cause of the error. This greatly [...] possibilities of solving an error on its first occurrence. At worst, the system programmer will [...] debugging the problem to determine the next step without having to run various traces for an undetermined amount of time and wait for the problem to recur.

[...] This facility requires a [...] Network (LAN) adapters [...] tape, or trace.

A major problem on many systems is the gathering of useful information on the first occurrence of a failure. This is known as first-failure data capture. System traces are usually not run during 'normal' production periods because of the performance impact. Therefore, on the first occurrence of an error, no information is available.

A second problem is the wrapping of the trace table [...]. [...] problems are intermittent and the initial failure is [...]. Once a user realizes the problem has occurred and tries to save the trace, the trace table has wrapped and all relevant information about the error is lost. This is especially true on systems like LAN servers and adapter microcode.

Software counter example - In the 3172, software counters were used for debugging when the performance of trace was too degrading and masked the problems that were occurring. The specific function in the 3172 where these software counters were used was the transmission and reception of frames of the Fiber Distributed Data Interface (FDDI) media.

In the transmit flow, four steps were involved. The first step was the notification by the HOST operating system that a frame was ready. The second step was the moving of data from HOST memory to the FDDI shared memory. The third step was the moving of the data from the FDDI shared memory to the FDDI chipset. The fourth and final step was to notify the FDDI chipset that a frame was ready to be transmitted. At each of these steps it was possible for the HOST frame to be lost for reasons ranging from lack of storage to FDDI chipset errors. A software counter was used at each step to track the progress of the frame. A software counter was also used for every possible error condition in the transmit flow. These software counters gave the ability to continuously monitor the system activity. At any point in time, it was easy to determine whether an abnormal number of frames were being lost, data errors were occurring, or the system was performing normally.
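The transmit flow described above can be sketched in C as follows. This is only an illustration, not the 3172 microcode: the counter names, the `transmit_frame` function, and its two flag parameters (which stand in for real storage and hardware checks) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter IDs: one per transmit step, one per error path. */
enum tx_counter {
    TX_HOST_NOTIFY,      /* step 1: HOST signalled that a frame is ready   */
    TX_COPY_TO_SHARED,   /* step 2: HOST memory -> FDDI shared memory      */
    TX_COPY_TO_CHIPSET,  /* step 3: FDDI shared memory -> FDDI chipset     */
    TX_CHIPSET_NOTIFY,   /* step 4: chipset told the frame is ready        */
    TX_ERR_NO_STORAGE,   /* abnormal path: no shared memory available      */
    TX_ERR_CHIPSET,      /* abnormal path: chipset reported an error       */
    TX_COUNTER_COUNT
};

static uint32_t tx_counters[TX_COUNTER_COUNT];

/* The whole cost of a counter: one increment of a memory variable. */
static void count(enum tx_counter id)
{
    assert(id < TX_COUNTER_COUNT);
    tx_counters[id]++;
}

uint32_t tx_counter_value(enum tx_counter id)
{
    assert(id < TX_COUNTER_COUNT);
    return tx_counters[id];
}

/* Simulated transmit path. Returns 0 on success, -1 if the frame is lost.
 * storage_available and chipset_ok are stand-ins for real checks. */
int transmit_frame(int storage_available, int chipset_ok)
{
    count(TX_HOST_NOTIFY);
    if (!storage_available) {
        count(TX_ERR_NO_STORAGE);
        return -1;
    }
    count(TX_COPY_TO_SHARED);
    count(TX_COPY_TO_CHIPSET);
    if (!chipset_ok) {
        count(TX_ERR_CHIPSET);
        return -1;
    }
    count(TX_CHIPSET_NOTIFY);
    return 0;
}
```

Comparing the step counters shows where frames stop: if `TX_HOST_NOTIFY` is larger than `TX_CHIPSET_NOTIFY`, the error counters indicate which abnormal path accounts for the difference.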

The receive flow was also performed in four basic steps. As with the transmit flow, the received frame could be discarded or lost at any one of the steps because of an abnormal error condition. Software counters were placed at each stage of the receive process and at all abnormal paths.

After software counters were fully implemented throughout the FDDI microcode, they were used for debugging any problems which were masked by the trace utility. The use of system trace was no longer necessary. The software counters immediately identified the code path which was in error.

The advantages of software counters follow: 

Continuously Running: The software counters are compiled in line and are incremented each time the code section is executed. Periodic 'snapshots' of the system can be taken to monitor the overall system activity and performance. There is no user interaction which can result in the wrong data being captured.
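A periodic snapshot could look like the sketch below. The counter count, the `struct snapshot` type, and the function names are hypothetical and chosen only for illustration; the disclosure does not describe how snapshots were taken.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_COUNTERS 4u  /* hypothetical size, kept small for the example */

static uint32_t live_counters[NUM_COUNTERS];

/* A snapshot is a plain copy of the counter block at one point in time. */
struct snapshot {
    uint32_t value[NUM_COUNTERS];
};

/* Inline increment, as executed by the instrumented code section. */
void bump(unsigned id)
{
    assert(id < NUM_COUNTERS);
    live_counters[id]++;
}

void take_snapshot(struct snapshot *out)
{
    memcpy(out->value, live_counters, sizeof live_counters);
}

/* Activity between two snapshots. The result is reduced modulo 2^32, so
 * the delta stays correct even if the counter wrapped once in between. */
uint32_t delta(const struct snapshot *older, const struct snapshot *newer,
               unsigned id)
{
    assert(id < NUM_COUNTERS);
    return newer->value[id] - older->value[id];
}
```

Taking the snapshot only reads the counters, so nothing is reset and the running totals remain available for later comparison.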

No Performance Degradation: Unlike system traces, the software counters induce no degradation to the system performance. No file writes or text conversions are necessary. A simple increment of a memory variable is all that occurs.

First Failure Data Capture: Whenever an error occurs, the software counter information is present and can be extracted to check for any abnormal counter increments. Also, all events are being tracked, and many errors occur when abnormal code paths are executed. By having the software counters, it is always possible to determine which abnormal code path was executed. With system traces, it is far too expensive to run a trace to cover all possible paths. For many errors, the system programmer never has any idea what the root cause of an error was. Sometimes special traces and traps must be written if system traces fail to capture the cause of the error. Software counters eliminate the system programmer time spent determining the root cause of an error. This allows the programmer to spend more useful time with the customer.
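Checking for abnormal counter increments after a failure might be sketched as below. The receive-side counter names and the `is_abnormal` table are assumptions made for the example; they are not taken from the 3172 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receive-side counters; the last two mark abnormal paths. */
enum rx_counter {
    RX_RECEIVED,
    RX_DELIVERED,
    RX_NO_BUFFER,
    RX_BAD_FRAME,
    RX_COUNT
};

static uint32_t rx_value[RX_COUNT];
static const int is_abnormal[RX_COUNT] = { 0, 0, 1, 1 };

void hit(enum rx_counter id)
{
    assert(id < RX_COUNT);
    rx_value[id]++;
}

/* After an error, list every abnormal-path counter that is nonzero.
 * Writes at most max counter IDs to out and returns how many were found. */
size_t abnormal_paths_taken(enum rx_counter *out, size_t max)
{
    size_t found = 0;
    for (int i = 0; i < RX_COUNT; i++) {
        if (is_abnormal[i] && rx_value[i] != 0) {
            if (found < max)
                out[found] = (enum rx_counter)i;
            found++;
        }
    }
    return found;
}
```

Because the counters run from system start, this check works on the first occurrence of the error without any trace having been enabled beforehand.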

System and Unit Test Verification: By using software counters, various test cases can be verified to execute specific code paths. The problem with many test cases is not knowing whether all of the various code paths are tested. With software counters present, it is obvious whether a code path was executed or not.
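As a sketch of this kind of verification, a test can run its inputs and then inspect the per-path counters. The function under test, the path names, and the length limits below are illustrative assumptions and are not FDDI values.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative limits only; not taken from the FDDI specification. */
#define MIN_FRAME_LEN 16u
#define MAX_FRAME_LEN 4096u

enum path_id { PATH_NORMAL, PATH_SHORT_FRAME, PATH_OVERSIZE_FRAME, PATH_COUNT };

static uint32_t path_hits[PATH_COUNT];

/* Hypothetical function under test: every branch bumps its own counter. */
int check_frame_length(unsigned length)
{
    if (length < MIN_FRAME_LEN) {
        path_hits[PATH_SHORT_FRAME]++;
        return -1;
    }
    if (length > MAX_FRAME_LEN) {
        path_hits[PATH_OVERSIZE_FRAME]++;
        return -1;
    }
    path_hits[PATH_NORMAL]++;
    return 0;
}

/* Nonzero if the given code path ran at least once. */
int path_was_executed(enum path_id id)
{
    assert(id < PATH_COUNT);
    return path_hits[id] != 0;
}

/* Number of code paths that no test case has reached yet. */
unsigned unexecuted_paths(void)
{
    unsigned missing = 0;
    for (int i = 0; i < PATH_COUNT; i++)
        if (path_hits[i] == 0)
            missing++;
    return missing;
}
```

A test suite that feeds in only a normal and a short frame would leave `PATH_OVERSIZE_FRAME` at zero, which shows directly that a test case is missing.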

Small Area of Memory Required: Each software counter is a 32-bit counter which requires 4 bytes of memory. Implementing 256 specific counters requires only a 1K block of memory. Most useful trace tables are over 32K in size. Many others require a large file to be allocated on a disk, which adds greatly to the performance impact of the trace.
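The arithmetic can be checked at compile time, assuming a C11 compiler. The structure and macro names are hypothetical; 32K is used as the trace table size because the text gives it as a lower bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COUNTERS      256u
#define TRACE_TABLE_BYTES (32u * 1024u)  /* lower bound quoted in the text */

/* Hypothetical layout: the counters stored as one contiguous block. */
struct counter_block {
    uint32_t counter[NUM_COUNTERS];
};

/* 256 counters x 4 bytes = 1024 bytes; compilation fails otherwise. */
static_assert(sizeof(struct counter_block) == 1024,
              "256 counters of 4 bytes each should fill exactly 1K");

/* How many such counter blocks fit in the smallest useful trace table. */
size_t counter_blocks_per_trace_table(void)
{
    return TRACE_TABLE_BYTES / sizeof(struct counter_block);
}
```

Under these assumptions a 32K trace table occupies the space of 32 counter blocks, or 8192 individual counters.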



Below are disadvantages of the software counters: 

To fully utilize the information made available by the software counters, a programmer must have a good understanding of the code's function and flow. Many counters can be used to track the flow of data through a task. If the programmer is not familiar with the specific task, the software counters will not yield any useful information.

When new counters need to be added, the code must be recompiled and the structure variable which contains the counters must be changed. This requires a new level of code to be sent when a new counter is added. However, the same basic problem exists with system traces today. When a new trace event needs to be added, the code must be recompiled.